To address the classification of imbalanced data sets, and the fact that general cost-sensitive learning algorithms cannot be applied to multi-class problems, an ensemble method for a cost-sensitive algorithm based on the average distance of K-Nearest Neighbor (KNN) samples was proposed. Firstly, following the idea of maximizing the minimum margin, a resampling method that reduces the density of samples near the decision boundary was proposed. Then, the average distance to the samples of each class was used as the basis for judging classification results, and a learning algorithm based on Bayesian decision theory was proposed, which made the improved algorithm cost-sensitive. Finally, the improved cost-sensitive algorithm was integrated into an ensemble according to the K value: the weight of each base learner was adjusted according to the principle of minimum cost, yielding a cost-sensitive AdaBoost algorithm that aims at the minimum total misclassification cost. The experimental results show that, compared with the traditional KNN algorithm, the improved algorithm reduces the average misclassification cost by 31.4 percentage points and has better cost sensitivity.
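The cost-sensitive decision step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the function names, the inverse-distance rule that turns the average KNN distance of each class into a pseudo-posterior, the cost matrix, and the toy data are all hypothetical. Only the final rule, choosing the class with the lowest expected misclassification cost, is the standard Bayesian minimum-risk decision.

```python
import math

def avg_knn_distance(x, samples, k):
    """Mean Euclidean distance from x to its k nearest points in samples."""
    dists = sorted(math.dist(x, s) for s in samples)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

def pseudo_posteriors(x, class_samples, k, eps=1e-12):
    """Assumption (not from the abstract): score each class by the inverse
    of its average KNN distance, then normalize the scores to sum to 1."""
    scores = {c: 1.0 / (avg_knn_distance(x, pts, k) + eps)
              for c, pts in class_samples.items()}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

def min_risk_class(x, class_samples, cost, k=3):
    """Bayesian minimum-risk decision.
    cost[true][pred] is the cost of predicting `pred` when the truth is `true`.
    Returns the prediction with the lowest expected cost, plus all risks."""
    post = pseudo_posteriors(x, class_samples, k)
    risk = {pred: sum(post[t] * cost[t][pred] for t in post) for pred in post}
    return min(risk, key=risk.get), risk

# Hypothetical toy data: a dense majority class and a sparse minority class.
class_samples = {
    "majority": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "minority": [(3, 3), (3, 4)],
}
# Missing a minority sample is assumed to cost 5 times more than a false alarm.
skewed_cost = {"majority": {"majority": 0, "minority": 1},
               "minority": {"majority": 5, "minority": 0}}
uniform_cost = {"majority": {"majority": 0, "minority": 1},
                "minority": {"majority": 1, "minority": 0}}

query = (1.5, 1.5)
label_skewed, _ = min_risk_class(query, class_samples, skewed_cost, k=2)
label_uniform, _ = min_risk_class(query, class_samples, uniform_cost, k=2)
print(label_skewed, label_uniform)
```

In this toy setting the query point lies closer to the majority class, so the uniform cost matrix assigns it to the majority, while the skewed cost matrix moves the decision to the minority class. The sketch handles any number of classes, since the risk is computed per candidate prediction.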
The traditional hypernetwork model is biased towards the majority class, which leads to much higher accuracy on the majority class than on the minority class when it is applied to imbalanced data classification. In this paper, a Boosting ensemble of cost-sensitive hypernetworks was proposed. Firstly, cost-sensitive learning was introduced into the hypernetwork model to obtain a cost-sensitive hypernetwork model. Meanwhile, to make the algorithm adapt to the misclassification cost of the positive class, the cost-sensitive hypernetworks were integrated by Boosting. The proposed model corrects the bias of the traditional hypernetwork model towards the majority class in imbalanced data classification and improves the classification accuracy on the minority class. The experimental results show that the proposed scheme has advantages in imbalanced data classification.
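The way Boosting can be made sensitive to the misclassification cost of the positive class can be sketched as follows. This is not the authors' method: the abstract does not specify how costs enter the ensemble, so the sketch assumes one common variant, AdaBoost with initial sample weights proportional to each sample's misclassification cost. The weak learner is a one-dimensional threshold stump used purely as a stand-in for a hypernetwork, and all names, parameters, and data are hypothetical.

```python
import math

def fit_stump(xs, ys, w):
    """Stand-in weak learner (NOT a hypernetwork): the 1-D threshold stump
    with the lowest weighted error. Returns (error, threshold, polarity)."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            preds = [pol if x >= thr else -pol for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def stump_predict(x, thr, pol):
    """Predict `pol` at or above the threshold, `-pol` below it."""
    return pol if x >= thr else -pol

def cost_sensitive_boost(xs, ys, cost_pos, cost_neg, rounds=5):
    """AdaBoost with cost-proportional initial weights (an assumed variant).
    Labels are +1 (positive, minority) and -1 (negative, majority)."""
    raw = [cost_pos if y == 1 else cost_neg for y in ys]
    total = sum(raw)
    w = [r / total for r in raw]
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = fit_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Standard AdaBoost update: raise weights of misclassified samples.
        w = [wi * math.exp(-alpha * y * stump_predict(x, thr, pol))
             for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, thr, pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical imbalanced data: a single positive sample at x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [-1, -1, -1, 1, -1, -1]

plain = cost_sensitive_boost(xs, ys, cost_pos=1, cost_neg=1, rounds=1)
costly = cost_sensitive_boost(xs, ys, cost_pos=5, cost_neg=1, rounds=1)
print(predict(plain, 4), predict(costly, 4))
```

With equal costs, the first weak learner in this toy example minimizes weighted error by predicting the negative class everywhere, which is the majority bias the abstract describes. With a positive-class cost of 5, the same learner instead accepts two false alarms in order to recover the positive sample.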